Scale-rich metabolic networks: background and introduction 

Reiko Tanaka 

Bio-Mimetic Control Research Center, RIKEN, Moriyama-ku, Nagoya 463-0003, Japan 

John Doyle 

Control and Dynamical Systems, California Institute of Technology, MC 107-81, Pasadena, CA 91125, USA 

(Dated: February 9, 2008) 

Recent progress has clarified many features of the global architecture of biological metabolic 
networks, which have highly organized and optimized tolerances and tradeoffs (HOT) for functional 
requirements of flexibility, efficiency, robustness, and evolvability, with constraints on conservation of 
energy, redox, and many small moieties. One consequence of this architecture is a highly structured 
modularity that is self-dissimilar and scale-rich, with extremes in low and high variability, including 
power laws, in both metabolite and reaction degree distributions. This paper illustrates these 
features using the well-understood stoichiometry of metabolic networks in bacteria, and a simple 
model of an abstract metabolism. 

The simplest model of metabolic networks is a stoichiometry matrix, or s-matrix for short, with
rows of metabolites and columns of reactions. For example, for the two reactions

\[
\begin{aligned}
S_1 + \mathrm{ATP} &\rightarrow S_2 + \mathrm{ADP}, \\
S_3 + \mathrm{NADH} &\rightarrow S_4 + \mathrm{NAD},
\end{aligned}
\tag{1}
\]

among four carriers ATP, ADP, NADH and NAD and four other metabolites $S_1$, $S_2$, $S_3$, $S_4$,
the s-matrix (with rows ordered $S_1$, $S_2$, $S_3$, $S_4$, ATP, ADP, NADH, NAD) is given by

\[
\begin{bmatrix}
-1 & 1 & 0 & 0 & -1 & 1 & 0 & 0 \\
0 & 0 & -1 & 1 & 0 & 0 & -1 & 1
\end{bmatrix}^{T}.
\tag{2}
\]
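As a minimal sketch of how the s-matrix for the two example reactions can be assembled, the following Python snippet builds it from a per-reaction list of stoichiometric coefficients. The row ordering, the dictionary representation of a reaction, and the helper name `build_s_matrix` are illustrative assumptions, not part of the original text.

```python
import numpy as np

# Assumed row order: the four other metabolites first, then the four carriers.
METABOLITES = ["S1", "S2", "S3", "S4", "ATP", "ADP", "NADH", "NAD"]

# Each reaction maps metabolite name -> stoichiometric coefficient
# (negative = consumed, positive = produced).
REACTIONS = [
    {"S1": -1, "ATP": -1, "S2": 1, "ADP": 1},    # S1 + ATP  -> S2 + ADP
    {"S3": -1, "NADH": -1, "S4": 1, "NAD": 1},   # S3 + NADH -> S4 + NAD
]


def build_s_matrix(metabolites, reactions):
    """Return the s-matrix with one row per metabolite and one column per reaction."""
    index = {name: i for i, name in enumerate(metabolites)}
    s = np.zeros((len(metabolites), len(reactions)), dtype=int)
    for j, reaction in enumerate(reactions):
        for name, coeff in reaction.items():
            s[index[name], j] = coeff
    return s


S = build_s_matrix(METABOLITES, REACTIONS)
print(S.T)  # printed transposed, one reaction per line
```

Each column sums to zero here only because both example reactions happen to have two substrates and two products with unit coefficients; this is not a general property of s-matrices.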

Metabolic stoichiometry is perhaps the most unambiguously known aspect of biological networks and
makes an attractive basis for contrasting different approaches to complex networks [?]. Figure 1
shows a color-coding of the s-matrix for H. Pylori core metabolism [?] (all conclusions are
essentially the same for the larger s-matrix of E. Coli), with both metabolites and reactions
decomposed into modules. The function of each reaction module is to make output metabolites from
input metabolites. For example, catabolism takes external nutrients, activates carriers, and makes
13 precursor metabolites, and amino acid biosynthesis outputs amino acids with these precursor
metabolites as inputs. Metabolites are categorized into precursor, carrier, and other (than
precursor and carrier) metabolites. Precursor metabolites are outputs of catabolism and are the
starting points for biosynthesis. Carrier metabolites correspond to conserved quantities; they are
activated in catabolism and act as carriers to transfer energy and phosphate groups,
hydrogen/redox, amino groups, acetyl groups, and one-carbon units throughout all modules. The list
of the carrier metabolites considered here is shown in Table III in the appendix. (Some metabolites
act as carriers only in some reactions, but not in others.) The other (than carriers and
precursors) metabolites occur primarily in separate reaction modules.

This categorization of metabolites is compatible with the standard schematic 'bow-tie' structure of
metabolism shown in Fig. 2, where a large 'fan-in' of nutrient inputs is catabolized to produce a
small handful of activated carriers and precursor metabolites, which then 'fan-out' to the
biosynthesis of a large number of primary building blocks [6]. The biologically natural modular
decomposition in metabolites is thus into 'knot' (carriers and precursors) and non-'knot' (others)
parts of the 'bow-tie.' Further examination of this structure of stoichiometry shows it to
facilitate a variety of highly organized and optimized tolerances and tradeoffs (HOT) [4] for
flexibility, adaptability, efficiency, robustness, and evolvability [5], [?] in the face of a large
number of constraints on conserved quantities. Thus it is an architecture that is ubiquitous
throughout biology and advanced technologies as well. While this is all a network-level
interpretation of standard textbook biochemistry, statistical studies [3] of 80 fully sequenced
organisms produce similar conclusions about the universal 'bow-tie' structure of metabolism.

The information conveyed in the s-matrix can be represented in another graphical form, as in Fig. 3
for the simple system in (1). This is a color-coded bipartite graph of reaction and metabolite
nodes, which we will call an s-graph. The metabolite nodes can be further differentiated into those
for non-carrier and carrier metabolites. Some carrier metabolites are always involved in reactions
as a pair and thus can be combined to simplify the s-graph (Fig. 3, right). Models which further
reduce s-graphs to simple graphs, as is standard in the physics literature [?], with only either
metabolites or reactions (by elimination of the other) destroy their biochemical meaning. All the
information in the s-matrix is conveyed to s-graphs by the same color-coding as in the matrix, with
the same importance in rows (metabolites) and columns (reactions). An example of an s-graph is
shown in Fig. 4 for a part of the amino acid biosynthesis module in H. Pylori.

FIG. 1: S-matrix for H. Pylori metabolism with 325 metabolites and 315 reactions, with functional decomposition into
modules. Red and blue correspond to positive and negative elements, respectively, for irreversible reactions, and pink and
green correspond to positive and negative elements, respectively, for reversible reactions. Reactions (columns) have standard
functional modules of catabolism and biosynthesis, which is further split into amino acid, nucleotide, fatty acid/lipid/cell
structures, and cofactor biosynthesis. The rows (metabolites) are arranged by their role in reaction modules to clarify the
sparsity pattern of long chains of successive reactions from inputs to outputs in each module. The bottom rows are precursor
metabolites and carrier metabolites, which appear throughout different reaction modules.

FIG. 2: Schematic drawing of global "bow-tie" structure in general metabolic networks.
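The s-graph construction described above can be sketched in a few lines of Python: every nonzero s-matrix entry becomes one edge between a metabolite node and a reaction node, and carriers that always occur as a pair are collapsed into a single node. This is only an illustration on the two-reaction example; the reaction labels `R1`, `R2`, the pair labels, and both helper names are hypothetical.

```python
import numpy as np

METABOLITES = ["S1", "S2", "S3", "S4", "ATP", "ADP", "NADH", "NAD"]
# s-matrix of the two example reactions (rows: metabolites, columns: reactions).
S = np.array([
    [-1, 0], [1, 0], [0, -1], [0, 1],   # S1, S2, S3, S4
    [-1, 0], [1, 0], [0, -1], [0, 1],   # ATP, ADP, NADH, NAD
])
# Carriers assumed to always appear together, keyed by the merged node label.
CARRIER_PAIRS = {"ATP/ADP": ("ATP", "ADP"), "NADH/NAD": ("NADH", "NAD")}


def s_graph_edges(s, metabolites):
    """One (metabolite, reaction, coefficient) edge per nonzero s-matrix entry."""
    rows, cols = np.nonzero(s)
    return [(metabolites[i], f"R{j + 1}", int(s[i, j])) for i, j in zip(rows, cols)]


def merge_carrier_pairs(edges, pairs):
    """Collapse paired carriers into one node, keeping the coefficients of both members."""
    alias = {member: label for label, members in pairs.items() for member in members}
    merged = {}
    for metabolite, reaction, coeff in edges:
        node = alias.get(metabolite, metabolite)
        merged.setdefault((node, reaction), []).append(coeff)
    return merged


edges = s_graph_edges(S, METABOLITES)
merged = merge_carrier_pairs(edges, CARRIER_PAIRS)
for (node, reaction), coeffs in merged.items():
    print(node, reaction, coeffs)
```

Keeping the signed coefficients on the edges is what preserves the color-coding information of the s-matrix; a projection onto metabolites alone would discard it.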

In studying degree statistics for s-graphs, degrees (number of edges from a node) for both types of
nodes, reaction and metabolite, are important (and equivalent to degrees of columns and rows of the
s-matrix). Of particular interest is the claim that metabolite degrees obey a power law, which is
reasonably consistent with the full network metabolite degrees (black +) in Fig. 5(d-f), which
shows an approximate power-law distribution in a log-log (e-f) rank plot, and has clearly higher
variability than an exponential as seen in a semilog (d) plot. What is more fundamental than power
laws is high variability.

FIG. 3: S-graph representation of enzymatically catalyzed reactions (1) with s-matrix (2). An
s-graph consists of reaction nodes (black diamond), non-carrier metabolite nodes (orange square),
and carrier metabolite nodes (light blue square). Edges are color coded as in the s-matrix, so all
the information in the s-matrix appears schematically in the s-graph. The s-graph on the right is
simplified by grouping carriers which always occur in pairs (ATP/ADP, NAD/NADH, etc.).
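Since node degrees are just counts of nonzero entries along the rows and columns of the s-matrix, they can be read off directly. The sketch below does this for the two-reaction example and also prepares the (degree, rank) pairs used in a rank plot; the function names are illustrative, and ties are ranked simply by position after sorting, which is an assumption about how the plots were made.

```python
import numpy as np

# s-matrix of the two example reactions (rows: metabolites, columns: reactions).
S = np.array([
    [-1, 0], [1, 0], [0, -1], [0, 1],   # S1, S2, S3, S4
    [-1, 0], [1, 0], [0, -1], [0, 1],   # ATP, ADP, NADH, NAD
])


def node_degrees(s):
    """Metabolite degrees (nonzeros per row) and reaction degrees (nonzeros per column)."""
    nonzero = s != 0
    return nonzero.sum(axis=1), nonzero.sum(axis=0)


def rank_plot_points(degrees):
    """Degrees sorted in decreasing order, paired with ranks 1, 2, ..., n."""
    ordered = np.sort(np.asarray(degrees))[::-1]
    ranks = np.arange(1, len(ordered) + 1)
    return ordered, ranks


metabolite_degrees, reaction_degrees = node_degrees(S)
ordered, ranks = rank_plot_points(metabolite_degrees)
print(metabolite_degrees, reaction_degrees)
```

Plotting rank against degree on log-log axes gives a roughly straight line for power-law-like data, while a semilog plot gives a roughly straight line for exponential data.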

TABLE I: Coefficients of variation of metabolite node degree distribution. B.S.: Biosynthesis. Each of carrier, precursor, and
other metabolites has low variability in each module, and their sum results in the high variability in total.

                 Catabolism  Amino B.S.  Nuc. B.S.  Lipid B.S.  Vitamin B.S.  All modules
Others              0.38        0.49       0.56        0.67        0.42          0.61
Precursors          0.47        1.05        -          0.35        0.61          0.60
Carriers            0.50        0.81       1.23        0.64        0.92          1.13
All metabolites     0.63        0.88       1.20        0.90        1.04          1.72



FIG. 4: An s-graph for the amino acid biosynthesis module 
of the H. Pylori s-matrix. The conventions are the same as those
in Fig. 3. This illustrates that long biosynthetic pathways
build complex building blocks (in yellow on the right) from 
precursors (in orange on the left) in a series of simple reactions 
(in the middle). Each biosynthetic module has a qualitatively 
similar structure. 



For low variability processes, Gaussians arise naturally because of the well-known central limit
theorem (CLT), and thus require no additional 'special' explanations. Exponentials have other
important invariance properties, and are thus also quite common. All degrees of each module in
Fig. 5 are closer to exponentials, and have low variability. Even more important is that relaxing
finite variance in the CLT yields power laws, which are further invariant under marginalization,
mixtures, and maximization [?]. Given the abundance of high variability phenomena, power laws are
an obvious null hypothesis and should properly be viewed as 'more normal than Normal' [?]. Thus we
will focus on the mechanism responsible for low variability in reaction and module metabolite
degrees, yet high variability in total metabolite degrees.

Table I shows the coefficient of variation ($\mathrm{CV} = \sigma/\mu$, where $\mu$ and $\sigma$
are the sample mean and standard deviation) for the horizontal and vertical decomposition of the
s-matrix in Fig. 1. The CV is a standard measure of variability, with low variability exponentials
having CV = 1, and power laws having divergent CV for large data sets. The only high variability in
Table I appears for all metabolites in the full network (all modules). It is clear from Fig. 5(d),
which shows the decomposition of metabolites into carrier (o), precursor (o), and other
metabolites (*), that the high variability in the whole network is mainly created by high $\sigma$
from carrier metabolites mixed with low $\mu$ from others. Figure 5(a) shows the decomposition of
carrier degrees into reaction modules. The larger marker corresponds to the degree in the whole
network, whereas the smaller ones correspond to those in each reaction module. The sum of shared
carrier metabolites across different reaction modules pushes the total degree of carriers much
higher. In contrast, the degrees for other metabolites (*) stay smaller, with many low degrees in
total (Fig. 5(d)). Their decomposition into reaction modules is shown in Fig. 5(c). As they appear
almost uniquely in each reaction module, the sum across different modules increases the number and
thus ranks, but does not greatly increase the degrees. The node degrees for precursor metabolites
have properties between those of carriers and others (Fig. 5(b)). The same structure is found in
E. Coli (Fig. 6).
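To make the CV benchmark concrete, the sketch below computes the CV of synthetic samples: an exponential sample, whose CV should be close to 1, and a heavy-tailed sample with infinite variance, whose CV is much larger and keeps growing with sample size. The sample size, the seed, the Lomax shape parameter 1.5, and the use of the population standard deviation (NumPy's default, which is indistinguishable from the sample version at this size) are all illustrative assumptions; none of this is data from the networks discussed here.

```python
import numpy as np


def coefficient_of_variation(values):
    """CV = sigma / mu, using the population standard deviation (ddof=0)."""
    values = np.asarray(values, dtype=float)
    return values.std() / values.mean()


rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
n = 200_000

# Exponential data: low variability, CV close to 1 whatever the scale.
exponential_sample = rng.exponential(scale=3.0, size=n)

# Lomax (Pareto II) data with shape 1.5: power-law tail with infinite variance.
heavy_tail_sample = rng.pareto(1.5, size=n)

cv_exponential = coefficient_of_variation(exponential_sample)
cv_heavy_tail = coefficient_of_variation(heavy_tail_sample)
print(f"exponential CV = {cv_exponential:.3f}, heavy-tail CV = {cv_heavy_tail:.3f}")
```

The heavy-tail value depends strongly on the largest few draws, which is exactly the sense in which the CV of power-law data does not settle down as the sample grows.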

The overall high variability, and thus the apparent power law, is created by a mixture of the high
summed degrees of a few shared carriers with the many (high-rank and) low-degree other metabolites
unique to each reaction module, with the precursors filling in between (Fig. 5(d)). Figure 5(f)
and the bottom row of Table I show another decomposition of all metabolites in the full network (+)
into reaction modules, each of which has relatively low variability. The entire network consists of
widely different scales, and thus could be called scale-rich.

The reaction node degrees, which are the number of metabolites involved in each reaction in Fig. 1,
are shown in Fig. 7. The number of carriers involved in a reaction is also an important statistic.
The typical reaction has four metabolites of which two are carriers, and no reactions differ
greatly from this. Overall there is very low variability in reaction degrees, and this too can be
explained by standard biochemistry. The enzymes of core metabolism are highly efficient and
specialized for high fluxes of small metabolites and thus necessarily have few metabolites and
involve simple reactions. This is not trivial, since the general purpose polymerases, chaperones,
and proteases involved elsewhere in the cell have an almost unlimited number of distinct
macromolecular substrates.

FIG. 5: Rank (cumulative distribution) of metabolite node degree (= number of reactions = number of links) for metabolic
networks of H. Pylori. Degrees of (a) carrier, (b) precursor, and (c) other metabolites in the whole network (large marker with
(a) blue, (b) red, (c) dark green) and in each reaction module (small markers with pink, dark red, brown, orange, light green
colors). Each module shows an exponential distribution. (d) Metabolite node degrees of the whole network (black +) resulting
from the mixture of carrier (o), precursor (o), and other metabolites (*), for which the plot is the same as for (a), (b), and
(c), respectively. (e) Log-log plot of (d) indicates that total degrees are approximately power laws. (f) The total metabolites
in each reaction module, each with an exponential distribution, sum up to create the power-law distribution in the whole network.

FIG. 6: Rank (cumulative distribution) of metabolite node degree for E. Coli metabolism. CV = 2.05.

FIG. 7: Reaction node degree (= number of substrates) distribution for metabolic networks of
H. Pylori. CV = 0.30. Contributions of carrier metabolites to degrees are also indicated.

A simple model shows the essential constraints that drive the structure of the network. In fact,
the only constraint that we will need to model from real metabolism is that each reaction has few
substrates split between shared carriers, precursors, and others. While this organizational
structure has many additional benefits in terms of efficiency, robustness, and evolvability [?], we
will only consider how shared common carriers make high variability at the full system level
despite low variability within metabolite modules. To emphasize the point we will assume that each
reaction has exactly one global carrier and one other metabolite, that there is just one carrier
and it appears in every reaction, and that each other metabolite is in just one reaction. With
these assumptions, in $r$ reactions, the $\sigma$ and therefore the CV of both the carrier and
other metabolites is exactly 0, the lowest CV value possible. The mixture of carriers and others
has one degree-$r$ carrier and $r$ degree-1 others. For large $r$ this gives $\mu \approx 2$ and
$\sigma \approx \sqrt{r}$, so the total $\mathrm{CV} \approx \sqrt{r}/2$. This is the highest
possible CV value that the metabolites in a nontrivial $r$-reaction s-matrix can have. Table II
shows CV for the case with $m_i$ such reactions in module $i$ and $r = \sum_i m_i$. This simple
model thus shows the high variability at the full system level despite low variability within
modules.
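The simple model is small enough to check numerically. The sketch below builds the degree list for one shared carrier plus one private metabolite per reaction and compares the resulting CV with the large-$r$ estimate $\sqrt{r}/2$. The module sizes are hypothetical values chosen for illustration (they are not the entries of Table II), and the population standard deviation is assumed so that a single carrier node has $\sigma = 0$; under that assumption the CV works out to exactly $(r-1)/(2\sqrt{r})$.

```python
import numpy as np


def coefficient_of_variation(degrees):
    """CV = sigma / mu with the population standard deviation (ddof=0)."""
    degrees = np.asarray(degrees, dtype=float)
    return degrees.std() / degrees.mean()


def model_degrees(num_reactions):
    """One shared carrier of degree num_reactions plus num_reactions others of degree 1."""
    return np.array([num_reactions] + [1] * num_reactions)


module_sizes = [10, 20, 30, 40]   # hypothetical m_i, one entry per module
r = sum(module_sizes)             # total number of reactions

cv_others = coefficient_of_variation(np.ones(r))       # others alone: all degree 1
cv_carrier = coefficient_of_variation([r])              # the single carrier alone
cv_modules = [coefficient_of_variation(model_degrees(m)) for m in module_sizes]
cv_total = coefficient_of_variation(model_degrees(r))   # carrier mixed with others

print("per module:", np.round(cv_modules, 3))
print("total:", round(cv_total, 3), "estimate sqrt(r)/2:", np.sqrt(r) / 2)
```

Each category on its own has zero variability, every module mixture has a smaller CV than the full network, and the total grows like $\sqrt{r}/2$ as modules are added.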

These assumptions are so extremely simplified that they would not even allow reactions to chain
together to create pathways, but this underscores the point that the mechanism at work here depends
minimally on the properties of metabolism per se. It simply requires a common carrier, as is found
in almost all advanced technologies, as well as all metabolisms. Real s-matrices have broader
distributions on both metabolites and reactions, and this smears out the distributions and lowers
the CV, but the qualitative features are universal and preserved. E. Coli has similar modularity
but more reactions than H. Pylori, and thus a higher total CV > 2. The strong invariance properties
of power laws mean that they can be easily caused by models based on only the most minimal
constraints of real metabolism, once they have high CV. Far from self-similar or scale-free, these
highly structured, 'scale-rich,' and self-dissimilar features of both the real data and the simple
model are the intrinsic features of metabolic networks. The high variability is thus due to the
highly optimized and structured protocol that uses common carriers and precursor metabolites [6],
and power laws are simply the natural null statistical hypothesis for such high variability data.
They require no further explanation beyond this natural biological one.



TABLE II: CV for simple model. 

In conclusion, this paper has shown that an appropriately arranged s-matrix and its corresponding
s-graph representation enable the clear visualization of the global bow-tie structure and reflect
directly the underlying biochemical mechanism that gives power-law metabolite node degree
distributions for the entire network. The decomposition of reactions and metabolites in a
biochemically meaningful way elucidates the scale-rich structures of the network, leading naturally
to a power-law degree distribution for metabolite nodes. This already shows a clear contrast
between real biological networks and models that ignore functional requirements and chemical
constraints to produce power-law degree distributions through random processes [8], although this
contrast deserves further exploration and exposition.

Appendix: HOT bowtie structure 

The robustness of the bowtie structure with a small 
knot of common currencies (carriers and precursors) 
is that it facilitates control, accommodating perturba- 
tions and fluctuations on many time and spatial scales. 
While metabolism allows large fluctuations in nutri- 
ents and products, relatively small fluctuations in ATP 
are lethal. But the very architecture that creates this 
fragility also helps alleviate it, since ATP concentrations 
are tightly regulated and not easily changed. Another 
major source of fragility is that universal common curren- 
cies responsible for robustness are easily hijacked by par- 
asites or used to amplify pathologic processes. Together 
the efficiency and adaptability of metabolism along 
with its fragilities illustrate Highly/Heterogeneous Optimized/Organized
Tradeoffs/Tolerance (HOT) [4]. The
metabolism bowtie architecture and associated proto- 
cols allow highly optimized tradeoffs between multiple 
requirements, such as reaction complexity (number of 
substrates in reaction), genome size, efficiency (energy 
required for each reaction), but particularly adaptabil- 
ity through tolerance of various perturbations and evolv- 
ability on longer time scales. Some general consequences 
of a HOT architecture are clear. For example, if every 
nutrient-product combination had independent pathways 
without shared precursors and carriers, the total genome 
would be vastly larger, and/or enzymes would be vastly 
more complex. In either case, adaptation to fluctuat- 
ing environments on any time scale would be difficult. 
Only an organization like the bowtie facilitates the kind 
of extreme heterogeneity that allows for robust regula- 
tion, manageable genome sizes, and biochemically plau- 
sible enzymes. 

TABLE III: List of carrier metabolites.

Phosphate group transfer    ATP/ADP/AMP
Hydrogen transfer           NADH/NAD, NADPH/NADP, FADH/FAD, OTHIO/RTHIO, MK/MKH2
Amino group transfer        AKG/GLU
Acetyl group transfer       ACCOA/COA
One carbon unit transfer    THF/METTHF/FTHF/MTHF/METHF
Others                      CO2, NH3, O2, H2O2, H2CO3, H2S, H2SO3, NO2,
                            Sulfate, Acetate, H+, Phosphate, Pyrophosphate, ACP